ABSTRACT

A significant part of the call processing software for Lucent's new PathStar™
access server [FSW98] was checked with formal verification techniques. The verification
system we built for this purpose, named FeaVer, is accessed via a standard web browser.
The system maintains a database of feature requirements together with the results of the
most recently performed verifications. Via the browser the user can invoke new verification
runs, which are performed in the background with the help of a logic model checking
tool. Requirement violations are reported either as high-level message sequence charts or
as detailed source-level execution traces of the system source. A main strength of the
system is its capability to detect potential feature interaction problems at an early stage of
systems design, the type of problem that is difficult to detect with traditional testing
techniques.

Error reports are typically generated by the system within minutes after a comprehensive
check is initiated, allowing near-interactive probing of feature requirements and quick
confirmation (or rejection) of the validity of tentative software fixes.



1. Introduction 

Distributed systems software can be difficult to test. The detailed behavior of such a system typically
depends on subtle timings of events and on the relative speed of execution of concurrent processes. This
means that errors, once they manifest themselves, can be very hard to reproduce, and it means that when the
system passes a series of tests, one cannot safely conclude that the same tests can never fail. The situation
is further complicated when we deal with increasingly complex software with multiple feature packages, a
phenomenon that most PC users today will be familiar with. The problem also exists in the systems code of
large telephone switching systems.

The best-known manifestation of the problem in call-processing applications is the so-called feature
interaction problem [KK98]. Telephone companies can compete with elaborate feature packages that are
offered to customers, ranging from standard features such as call forwarding, to more obscure variations
like call parking. The number of distinct features offered on a main switch today can be well over one
hundred. Each of these features can require a different response to the same basic set of events that can occur
on a subscriber line, and thus the feature interaction problem is born. With just 25 features there can
already be 2^25 possible feature combinations. If each combination could be tested in a second, it would
take about a year to test all combinations. By any standard, this is an undesirable strategy. For a simple
example of feature interaction, consider what the required response of the switch is if a customer has both
an anonymous call rejection and a call forwarding feature enabled simultaneously. Which of these two
features should take precedence when an anonymous call arrives for this customer?
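
The estimate is easy to check. The short sketch below assumes, as the text does, 25 features that can each be enabled or disabled independently and one second per test; the variable names are ours and purely illustrative.

```python
# Back-of-the-envelope check of the exhaustive-testing estimate.
# Assumptions taken from the text: 25 on/off features, one second per test.
FEATURES = 25
SECONDS_PER_TEST = 1

combinations = 2 ** FEATURES                 # every subset of enabled features
total_seconds = combinations * SECONDS_PER_TEST
seconds_per_year = 365 * 24 * 60 * 60

years = total_seconds / seconds_per_year
print(f"{combinations:,} combinations, about {years:.2f} years of testing")
```

With over one hundred features, as on a main switch, the same calculation gives a number of combinations far beyond anything that could be tested exhaustively.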

Fortunately, in practice the situation is not quite this bad. Telcordia, the former BellCore, has issued
standards on feature behavior that switch providers must comply with [B92-96]. According to these standards,
some feature combinations are not allowed, some are non-conflicting, and for some a feature precedence
relation is prescribed that determines which feature behavior is to take precedence in case of conflict. (For
the example above, for instance, the rules state that the anonymous call rejection feature should take
precedence.) Unfortunately, the rules from the standards are not always complete, and they are sometimes hard
to interpret unambiguously. The task of systematically checking if the Telcordia standards have been
implemented correctly by a vendor of a switching system therefore remains formidable.

Methods that can be used to mechanically verify distributed systems software should be of considerable
value in industrial software design. Specifically, we are interested here in methods that can be used to
formally verify the call processing software, and in particular the feature code, for a new commercial switch.
We will describe a system, named FeaVer, that can accomplish this feat.

Earlier attempts to apply automated verification techniques to distributed software applications generally
have relied on hand-crafted formal models, often produced by verification experts over a period of months
in collaboration with the developers of an application, e.g. [CAB98], [S98]. Because of the time required to
construct formal models by hand, detailed changes in the source application cannot easily be tracked
without a significant reinvestment of time and energy. By eliminating the need for hand-crafted models, the
system we will describe can be used to verify virtually every version of an application, tracking the
evolving source throughout the design cycle.

In the next few sections we will discuss the central components of the FeaVer feature verification system: 

• Mechanized model extraction: a method for mechanically extracting verification models from
implementation level code, controlled by a user-defined conversion table.

• Formulating properties: defining the set of formal requirements that the application has to satisfy. In
our case many properties could be derived from the Telcordia standards for call processing feature
implementation. Others define more specific local requirements or are more exploratory in nature.
The use of a database of correctness properties is comparable to the use of test suites and test
objectives in a traditional testing method.

• Logic model checking: the method that is used to mechanically verify if the system satisfies or
possibly can violate one or more of the stated requirements.

• System support: the mechanics of the verification process, including the use of a system of
networked PCs, called TrailBlazer, to execute verification jobs in parallel.

We conclude the paper with a summary of our findings.

2. Mechanized Model Extraction

It is known that it is not possible to devise an algorithm that could prove arbitrary properties of arbitrary C
or C++ programs. It is not even possible to mechanically prove a single specific property, such as program
termination, for arbitrary programs [T36], [S65]. So if we want to be able to render proofs, we have no
choice but to restrict ourselves to a smaller class of programs. An example of such a class is the set of all
finite state programs: programs that on any given input can generate only a finite number of possible
program states (i.e., memory configurations) when executed. We call a simplified program of this type a
model. The set of all possible executions for a finite state model defines a finite directed, and possibly
cyclic, graph. Even without explicitly constructing the complete graph, which can still be large, we can
now reason about feasible and infeasible paths in the graph, and prove if certain executions are possible or
not. This is precisely what a logic model checker is designed to do.
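
To illustrate what reasoning about feasible paths in a finite graph amounts to, the following minimal sketch searches a small, hypothetical state graph for a path to a designated state. The state names and the `find_path` helper are our own assumptions for illustration. A real model checker generates successor states on the fly instead of storing the whole graph in a table.

```python
from collections import deque

def find_path(graph, start, is_target):
    """Breadth-first search: return a path from start to a target state, or None."""
    parent = {start: None}
    queue = deque([start])
    while queue:
        state = queue.popleft()
        if is_target(state):
            path = []
            while state is not None:       # walk back to the start state
                path.append(state)
                state = parent[state]
            return path[::-1]
        for nxt in graph.get(state, ()):
            if nxt not in parent:          # visit each state at most once
                parent[nxt] = state
                queue.append(nxt)
    return None

# Hypothetical finite state model with a cycle: idle -> dialing -> connected -> idle.
MODEL = {
    "idle": ["dialing"],
    "dialing": ["connected", "idle"],
    "connected": ["idle"],
    "stuck": [],                           # defined, but not reachable from "idle"
}

feasible = find_path(MODEL, "idle", lambda s: s == "connected")
infeasible = find_path(MODEL, "idle", lambda s: s == "stuck")
print(feasible, infeasible)
```

Because each state is visited at most once, the search terminates even though the graph is cyclic; this is what the restriction to finite state models buys us.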

The first problem to be solved, then, is to reduce a given C or C++ program to a meaningful finite state
model that can be analyzed. The reduction will bring a loss of information, so it has to be chosen in such a
way that relevant information is preserved and irrelevant detail removed. What is 'relevant' and what is not
depends on the properties that we are interested in proving about the program. If, for instance, the
functioning of the billing subsystem is not mentioned in any of the system requirements we check, then all access to
and manipulation of billing data can be stripped from the program to produce the model. Some care has to
be taken, though, to guarantee that the removal of code preserves our ability to find all property violations.
The following procedure will ensure this.

All assignments and function calls that have been tagged as irrelevant to the verification effort are replaced
with a skip (a dummy no-op in the modeling language). All conditional choices that refer to data objects
tagged as irrelevant are replaced by nondeterministic choices. The use of nondeterminism is a standard
reduction technique that can be used to make a model more general, broadening its scope. The
nondeterminism tells the model checker that all possible outcomes of a choice should be considered equally
possible, not just one specifically computed choice. The original computation of the system is preserved as one
of the possible abstracted computations, and the scope of the verification is therefore not restricted. If no
property violation exists in the reduced system, we can safely conclude that no property violation can exist
in the original application.
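
The argument can be made concrete with a small sketch. The functions below are hypothetical and only illustrate the principle: a branch on data tagged as irrelevant (here an invented billing balance) is replaced by a choice over all outcomes, so every concrete behavior is also an abstract behavior.

```python
def concrete_step(balance):
    """Original code: the branch depends on billing data."""
    if balance > 0:
        return "connect"
    return "reject"

def abstract_step():
    """Abstracted code: billing data is tagged irrelevant, so the condition
    becomes a nondeterministic choice and both outcomes are considered."""
    return {"connect", "reject"}

def violates(outcome):
    # Hypothetical requirement: the response must be one of the two legal ones.
    return outcome not in {"connect", "reject"}

concrete_outcomes = {concrete_step(b) for b in range(-3, 4)}
abstract_outcomes = abstract_step()

# Every concrete outcome is covered by the abstraction ...
covered = concrete_outcomes <= abstract_outcomes
# ... so a clean abstract check carries over to the concrete program.
abstract_is_clean = not any(violates(o) for o in abstract_outcomes)
print(covered, abstract_is_clean)
```

The converse does not hold: the abstraction may admit outcomes that the concrete program can never produce for a given input.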

The reduction method is fail-safe, in the sense that if we chose the reduction incorrectly, the above result
still holds true, although the reverse does not [AL91], [CGL94], [K95], [B99]. It is possible, for instance, that
the full expansion of an error trace for a property violation detected in the reduced system does not
correspond to a valid execution of the original application. If this happens it constitutes a proof that information
was inadvertently stripped from the system that was in fact relevant to the verification. In this case at least
one of the conditional choices in the abstract trace will turn out to be invalid in the concrete trace, not
matching assignments to data objects earlier in the trace. These data objects are now known to be relevant
to the properties being verified, and the reduction can be adjusted accordingly. Typically a few iterations of
this type suffice to converge on a stable definition of an abstraction that can be used to extract a verifiable
model from a program text, as we will discuss in more detail below.

The PathStar Code

In the verification of the code for the PathStar access server [FSW98], shown in Figure 1, our focus is
exclusively on the verification of telephony features. Since we are not looking for faults in the sequential
code of device drivers, process schedulers, memory allocation routines, billing subsystems, etc., the
function of such code can be abstracted. For device drivers, for instance, it means that the abstractions used do
not enable us to check that a device driver administers dialtone correctly when given the appropriate
command by the controller, but it does enable us to check that the controller can only issue the appropriate
commands when required, and cannot fail to do so [HS99].




Fig. 1 — The PathStar™ Access Server
Providing data and voice service over a variety of media.

In the PathStar code the function of the controller is specified in a large routine that defines the central state
machine for all basic call processing and feature behavior. This routine, roughly 1,600 lines of C source, is
executed concurrently by a varying number of processes, jointly responsible for the handling of incoming
and outgoing calls. In the extracted model for this code, we carefully preserve all concurrent behavior and
the complete execution of the state machine, albeit in slightly abstracted form.

Nondeterministic test drivers in the model are used to generalize the behavior of all parts of the system that
are external to the state machine: subscriber behavior, connected devices, remote switches. The source text
of the original program is preserved in the abstract model so that we can easily reproduce a concrete trace from
any abstract error trace that is discovered.

With the reduction process we have outlined, the control flow of the original source is preserved in the
reduced model. Data access, however, passes through a user-defined abstraction filter. This filter, defined
as a conversion lookup table, determines which operations are irrelevant to the properties to be verified
(e.g., function calls for billing and accounting) and which need to be represented either literally, with an
equivalent representation in the language of the model checker, or in more abstract form. The irrelevant
operations are mapped into the null operation of the model checker.

The Conversion Table

Of all the different types of statements that appear in the PathStar call processing code, about 60% are
mapped to an equivalent statement in the extracted model (i.e., they are preserved in the abstraction), cf.
Figure 2. This includes all statements that cause messages to be sent from one process to another (like call
requests, call progress and call termination signals), and all statements that are used to record or test the call
state of a subscriber.

The remaining statements and conditionals are abstracted in one of three ways, depending on their
relevance to the verification effort.

1. A statement that is entirely outside the scope of the verification is replaced with skip and thereby
stripped from the model, as also discussed above. This applies to about 30% of the cases.

2. If a statement is partially relevant, the conversion table defines a mapping function that preserves
only the relevant part and suppresses the rest. For example, we do not use the absolute values of
timers in our verifications. For the properties we define it suffices to know only if a timer is running
or not, and therefore the integer range of possible timer values can be reduced with a mapping
function to the boolean values true and false, indicating whether or not a timer might expire. This
mapping could not be used if we were to include requirements on the real-time performance of the
PathStar switch. Other examples of this type of abstraction are cases where the details of an operation are
irrelevant, but the possible outcomes are not. For instance, digit analysis can be an involved operation
that is mostly irrelevant to the functional correctness of the call processing code. What is relevant is only that
the controller deals correctly with the possible outcomes of this operation: to respond properly when
an abbreviated number or a feature access code is recognized, to start routing the call if it is
determined that sufficient digits were collected, or to wait for the subscriber to provide more digits, with
the proper timers set to guard the inter-digit timing interval. In this case the conversion table replaces
the operation with a nondeterministic choice of the possible outcomes.

3. The third type of abstraction is used when an operation is fully relevant and needs to be preserved in
the model with only syntax adjustments.



Fig. 2 — Ratio of basic types of abstractions applied

To track changes in the source text, and to retain the capability to extract models, we only have to keep the
conversion table up to date, rather than a fully detailed hand-crafted model. Some changes in the source
require no update at all. This is the case, for instance, if code is copied or moved without the introduction of
new types of data manipulation. When a new type of data access appears, the model extractor warns the
user and prompts for a new entry in the conversion table. In most cases the new entry can be defined
without knowing anything about the purpose of the change or its impact on behavior. Typically, a week's
worth of upgrades of the call processing code translates into ten minutes of work on a revision of the
conversion table before a fully mechanized verification of all properties can be repeated. More detail on the
definition of conversion tables can be found in [HS99].
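
As a rough illustration of how such a table can drive the extraction, consider the sketch below. It is not the FeaVer table format: the source statements, the table layout, and the `extract` helper are all hypothetical, and the model text is written in a style resembling the SPIN input language.

```python
# Hypothetical conversion table: source statement -> (kind, model text).
# "strip" entries become skip, "map" entries are abstracted, "keep" entries
# are preserved with only syntax adjustments.
CONVERSION_TABLE = {
    "update_billing(call)": ("strip", "skip"),
    "timer = 3000": ("map", "timer_running = true"),
    "r = digit_analysis(buf)": (
        "map",
        "if :: r = ABBREV :: r = FEATURE_CODE :: r = ROUTE :: r = NEED_MORE fi",
    ),
    "send(peer, CALL_REQUEST)": ("keep", "peer!CALL_REQUEST"),
}

def extract(statements, table):
    """Translate statements through the table; collect those without an entry."""
    model, unknown = [], []
    for stmt in statements:
        if stmt in table:
            model.append(table[stmt][1])
        else:
            unknown.append(stmt)           # the user is prompted for a new entry
    return model, unknown

source = [
    "timer = 3000",
    "update_billing(call)",
    "send(peer, CALL_REQUEST)",
    "log_event(call)",                     # new type of statement, not yet in the table
]
model, unknown = extract(source, CONVERSION_TABLE)
print(model)
print(unknown)
```

Copying or moving existing statements changes only the `source` list, not the table, which is why many source changes require no update at all.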



Method                                                   Percentage of Code

Fully abstracted (stripped)                              30%
Functional and non-deterministic abstraction (mapped)    10%
Not abstracted (preserved)                               60%
Assumptions about the Environment

The call processing code in PathStar interacts with a number of entities in its environment: subscribers,
remote switches, database servers, and the like. The task of constructing precisely detailed behavior
definitions for each of these entities would be both formidable and redundant [H97].



For each remote entity that interacts with the call processing controller it suffices to construct a small
abstract model that captures a conservative estimate of the possible behaviors of these entities in a general
way. Note that our objective is not to verify the correct behavior of the remote entities, but that of our own
switch despite the presence of possibly ill-behaved remote entities. The system requirements, test drivers,
and the conversion map together define the verification context, as illustrated in Figure 3.

We can, for instance, define an abstract model for generic subscriber behavior with a simple demon that can
nondeterministically select an action from all the possible actions that a subscriber might take at each point
in a call: going on- or off-hook, flashing the hook, dialing feature access codes, etc. Similarly we can
model the possible responses from a remote switch to call requests from the local controller, using a demon
that can generate possible responses nondeterministically.

Abstractions such as these, based on nondeterminism, achieve two objectives: they remove complexity by
removing extraneous detail, while at the same time broadening the scope of the verification by representing
larger classes of possible behavior, instead of selected instances of specific behavior.
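
A subscriber demon of this kind can be sketched in a few lines. The action names and the two-state hook model below are our own simplifying assumptions, not the test drivers used in FeaVer; the point is only that the demon stands for every action sequence a subscriber could produce, not one scripted scenario.

```python
# Hypothetical subscriber demon: the actions available depend only on
# whether the phone is currently on-hook or off-hook.
ACTIONS = {
    "onhook": ["offhook"],
    "offhook": ["onhook", "flash", "dial_digit", "dial_feature_code"],
}

def next_hook_state(action):
    """Only going on-hook returns the phone to the on-hook state."""
    return "onhook" if action == "onhook" else "offhook"

def behaviors(hook_state, depth):
    """Yield every action sequence of the given length the demon can choose."""
    if depth == 0:
        yield ()
        return
    for action in ACTIONS[hook_state]:
        for rest in behaviors(next_hook_state(action), depth - 1):
            yield (action,) + rest

sequences = list(behaviors("onhook", 3))
print(len(sequences))
```

A model checker explores all of these choices implicitly; a traditional test suite would have to pick a few of them by hand.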

3. Formulating Properties 

The database for feature verification of the PathStar code that we have constructed contains approximately
80 properties for 20 features. For each feature set the database further defines one or more provisioning
constraints. When verifying the correct implementation of any given feature we must, for instance, be able
to specify that the feature is to be enabled, and that incompatible features are to remain disabled. These
additional constraints could be included in the definitions of the properties themselves, but the extra
information would hamper their readability. By decoupling provisioning detail from functional properties we
can more easily experiment with different types of provisionings on a common set of properties. It is, for
instance, possible to check the correct implementation of feature precedence relations by deliberately
enabling and disabling higher precedence features. In the absence of an explicit provisioning constraint, the
model checker will assume no knowledge about the enabledness or disabledness of features, leaving this to
nondeterministic choice.

Fig. 3 — Verification Context
The verification context consists of a conversion map that defines the level of
abstraction, test drivers that capture the essential assumptions about the
environment, and a database of properties that define the system requirements.

For each feature, each property is verified for each related provisioning constraint. For 80 properties this
translates, in our current database, into approximately 200 separate verification runs. The number of runs
to be performed for a full verification of the source code can change with the addition or deletion of
properties or provisioning constraints.
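
The way the number of runs follows from the database can be sketched as below. The database excerpt is invented for illustration (feature, property, and constraint names are hypothetical); only the rule that each property is paired with each related provisioning constraint comes from the text.

```python
# Hypothetical excerpt of a property database: each feature lists its
# properties and the provisioning constraints under which they are checked.
DATABASE = {
    "dialtone": {
        "properties": ["offhook_yields_dialtone"],
        "constraints": ["priority_features_off", "unconstrained"],
    },
    "call_forwarding": {
        "properties": ["forward_when_enabled", "no_forwarding_loop"],
        "constraints": ["cf_on", "cf_on_acr_off", "cf_on_acr_on"],
    },
}

def verification_runs(database):
    """One run per (feature, property, provisioning constraint) combination."""
    return [
        (feature, prop, constraint)
        for feature, entry in database.items()
        for prop in entry["properties"]
        for constraint in entry["constraints"]
    ]

runs = verification_runs(DATABASE)
print(len(runs))
```

With 80 properties and about 200 runs, the real database averages roughly 2.5 provisioning constraints per property.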

An Example 

As a simple example we will consider the formalization of the requirement that when a non-ringing phone
is picked up, dialtone is generated. Clearly, there are some exceptions to this rule. When the line is
provisioned as a hotline, with a direct call feature, or when the line is provisioned with the denial of originating
service feature, then no dialtone will be generated. To check this property, therefore, we need to define a
provisioning constraint that disables such higher priority features.

One method to specify this property is to use a simple form of linear temporal logic. Temporal logic,
introduced in the late seventies for the concise formulation of correctness properties of concurrent systems
[P77], defines a small number of operators that allow us to reason about executions. In temporal logic the
example property can be specified as follows:

$$\Box\,(\mathit{offhook} \rightarrow X\,\Diamond\,(\mathit{dialtone} \lor \mathit{onhook}))$$

In this case we allow for the possibility that the subscriber returns the phone onhook before actually hearing
the dialtone, which would of course be valid. The truth value of a temporal formula is evaluated over
execution sequences. This means that if we evaluate the formula at any given point in a system's execution it
would return true if and only if the complete remainder of the execution from that point forward satisfies
the property stated.

Three unary temporal operators are used in the formula above: $\Box$ (always), $X$ (next), and $\Diamond$
(eventually). $\Box p$ states that p is true now and will remain invariantly true throughout the rest of the
computation. $X p$ states that p will be true after the next execution step. $\Diamond p$ states that p is either
true now or will become true within a finite number of future execution steps. The right-arrow,
$\rightarrow$, denotes logical implication: $(p \rightarrow q)$ means $(\lnot p \lor q)$, where $\lnot$ is
logical negation and $\lor$ is logical or.

The model checker will use this formula to check if there can be any system executions that would violate
the property. This procedure works by first negating the formula, so that we get a formalization of a
violating execution. The negated formula for the example can also be derived manually as follows, using
standard rewrite rules from boolean and temporal logic:

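$$
\begin{aligned}
& \lnot\,\Box\,(\mathit{offhook} \rightarrow X\,\Diamond\,(\mathit{dialtone} \lor \mathit{onhook})) \\
\equiv\ & \Diamond\,\lnot\,(\mathit{offhook} \rightarrow X\,\Diamond\,(\mathit{dialtone} \lor \mathit{onhook})) \\
\equiv\ & \Diamond\,\lnot\,(\lnot\,\mathit{offhook} \lor X\,\Diamond\,(\mathit{dialtone} \lor \mathit{onhook})) \\
\equiv\ & \Diamond\,(\mathit{offhook} \land \lnot\,X\,\Diamond\,(\mathit{dialtone} \lor \mathit{onhook})) \\
\equiv\ & \Diamond\,(\mathit{offhook} \land X\,\lnot\,\Diamond\,(\mathit{dialtone} \lor \mathit{onhook})) \\
\equiv\ & \Diamond\,(\mathit{offhook} \land X\,\Box\,(\lnot\,\mathit{dialtone} \land \lnot\,\mathit{onhook}))
\end{aligned}
$$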

This negated formula can be converted mechanically [GPVW95] into a 2-state ω-automaton, illustrated in
Figure 4, which is used in the model checking process.
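
To make the meaning of the negated formula concrete, the sketch below evaluates it on a single infinite execution given in "lasso" form: a finite prefix followed by a loop that repeats forever. This is an illustration of the semantics only, under our own representation of steps as sets of proposition names; it is not the automaton construction of [GPVW95].

```python
def violates_dialtone_property(prefix, loop):
    """True if the infinite trace prefix, loop, loop, ... contains a step with
    offhook after which dialtone and onhook never occur again, i.e. if the
    trace satisfies the negated formula."""
    if not loop:
        raise ValueError("loop must contain at least one step")
    steps = list(prefix) + list(loop)
    loop_start = len(prefix)
    for i, step in enumerate(steps):
        if "offhook" not in step:
            continue
        # Steps strictly after position i: the rest of the unrolled trace,
        # plus the whole loop, since the loop repeats forever.
        future = steps[i + 1:] + steps[loop_start:]
        if all("dialtone" not in s and "onhook" not in s for s in future):
            return True
    return False

# Each step is the set of propositions that hold at that point.
silent = violates_dialtone_property([{"offhook"}], [set()])
served = violates_dialtone_property([{"offhook"}], [{"dialtone"}])
hung_up = violates_dialtone_property([{"offhook"}, {"onhook"}], [set()])
print(silent, served, hung_up)
```

Only the first trace is a violation: the phone goes off-hook and then nothing happens, forever. In the third trace the subscriber hangs up before dialtone, which the property explicitly allows.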

Timeline Editor

For the specification of complex behavior, e.g., to capture properties on the correct functioning of a six-way
conference call, an accurate formalization of the property in temporal logic can pose a challenge. We have
therefore experimented with an alternative method for specifying properties using a simple graphical user
interface. Though this form of property specification is not as general, it covers many of the types of
properties we are interested in. All properties specified in this way can be translated mechanically into temporal
logic formulae, or also directly into property automata for use in the model checking process. Figure 5
shows the specification of the earlier property, checking for dialtone after an offhook.

The timeline editor allows the user to define events that are part of the required behavior on a horizontal
line. Most events are markers that are used to identify the execution sequences of interest. These events
are labeled with an e in the diagram. Other events are required to appear in response to the earlier events.
These required events are marked with an r (not shown in Figure 5). The last event of the sequence is
always required and is marked with a t (for target). The model checker will flag an error if it can construct
an execution sequence in which the markers (e) are present, but one or more of the required events (r and t)
are missing.

Fig. 4 — Property Automaton
This ω-automaton (side box B) is automatically extracted from a temporal logic formula. It defines
the set of behaviors that would violate the property, and is used in the verification process much like a
pattern, to search the set of all possible system executions for matches.

Fig. 5 — Timeline Editor
The timeline editor provides an intuitive alternative method for property specification. The user
defines events and a test target, and marks multiple, possibly overlapping, intervals with constraints.

The timeline editor also allows us to state that certain events must be absent for the execution to be of
interest. In this case, this applies to onhook events. These conditions are specified as constraints on the
execution, using a horizontal bar under the timeline to identify the precise part of the execution to which the
constraint applies. For events or conditions that are not mentioned as events or in constraints, no restrictions
apply.

The property definition shown in Figure 5 is automatically converted into the same automaton as shown in 
Figure 4, so the two methods of specification yield identical checks in this case. 

4. Logic Model Checking

The formal models extracted from the source of the application are specified in the language of the LTL
model checker SPIN [H97]. SPIN models define the behavior of systems of asynchronous processes that
can communicate via message channels, rendezvous ports, or via shared data. SPIN converts the input
specification into a product of automata. The global behavior defined by this product can be checked
efficiently for a wide range of correctness properties using an automata theoretic model checking procedure
[VW86]. To perform the check, SPIN starts by computing an automaton that captures all possible
violations of a given correctness property. If the property is specified as a formula in linear temporal logic
[P77], for instance, SPIN takes the logical negation of the formula and converts it into a test automaton
using the procedure defined in [GPVW95]. This automaton formalizes system executions that should not
be feasible if the application is designed correctly. Next SPIN searches the intersection product of the
language defined by the negated property automaton and the language defined by the automaton that was
extracted from the system. If the intersection product is empty, no violations of the property are possible
and the system passes the test. If the intersection product is not empty it contains at least one complete
execution that is both in the language of the system (i.e., it is a possible execution of the system as specified)
and in the language of the negated property (i.e., it constitutes the violation of a property). In this case
SPIN generates that execution sequence as proof that the property can be violated. In the FeaVer
system the sequence is converted back into the source language of the application, using a reverse lookup in
the table that was used to extract the formal model from the source of the application. The verification
process is illustrated in Figure 6.

Fig. 6 — Overview of the Model Checking Process
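
The emptiness check on the intersection product can be illustrated with a deliberately simplified sketch. It hard-codes a two-state automaton for the negated dialtone property and looks for a reachable cycle through an accepting product state. The system graphs, state labels, and function names are hypothetical, and the search shown is a naive one chosen for brevity; it should not be read as SPIN's actual algorithm.

```python
def product_successors(system, labels, node):
    """Successors of a (system state, automaton state) pair. Automaton state 0
    waits for offhook; state 1 (accepting) requires that dialtone and onhook
    stay absent from then on."""
    state, q = node
    props = labels[state]
    if q == 0:
        next_q = [0] + ([1] if "offhook" in props else [])
    else:
        quiet = "dialtone" not in props and "onhook" not in props
        next_q = [1] if quiet else []
    return [(nxt, nq) for nxt in system[state] for nq in next_q]

def reachable(system, labels, start_nodes):
    """All product nodes reachable from the given start nodes."""
    seen, stack = set(), list(start_nodes)
    while stack:
        node = stack.pop()
        if node not in seen:
            seen.add(node)
            stack.extend(product_successors(system, labels, node))
    return seen

def can_violate(system, labels, initial):
    """True if some accepting product node lies on a reachable cycle."""
    for node in reachable(system, labels, [(initial, 0)]):
        if node[1] == 1:
            after = reachable(system, labels, product_successors(system, labels, node))
            if node in after:
                return True
    return False

LABELS = {"idle": {"onhook"}, "picked": {"offhook"}, "tone": {"dialtone"}, "stuck": set()}
GOOD = {"idle": ["picked"], "picked": ["tone"], "tone": ["idle"]}
BAD = {"idle": ["picked"], "picked": ["tone", "stuck"], "tone": ["idle"], "stuck": ["stuck"]}

good_result = can_violate(GOOD, LABELS, "idle")
bad_result = can_violate(BAD, LABELS, "idle")
print(good_result, bad_result)
```

In the second system the product contains the cycle through ("stuck", 1), which corresponds to an execution where the phone is picked up and dialtone never arrives.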



5. System Support: TrailBlazer

The main interface to the feature verification system is a standard web browser. Through the browser the
user can check on the verification status of all properties, look up the text and the justification of each
property, refer to the source text of the Telcordia feature requirement documents, and inspect reported error
sequences in a number of different formats. An error sequence can be displayed as a message sequence
chart in either ASCII or graphical form (Figure 7), or it can be displayed as a detailed dump of a series of
concurrent execution traces interleaved in time, with one trace for each of the processes that participated in
the failed execution. The detailed execution traces list all concrete C statements and conditions that are
executed or evaluated during the execution, in time sequence. Typically such a sequence reveals subtle
race conditions in the interleaving of actions that can lead to faults.

The main pieces of the infrastructure for the checking process (test drivers, the conversion lookup table, and
the supporting text for properties) are created and maintained with a standard text editor.
The source code of the application is maintained by the developers and parsed directly by the FeaVer
software when a verification run is initiated.

Verification runs are always initiated by the user through the web interface. It would also be possible to
automatically trigger a comprehensive series of checks each time that the FeaVer system detects that either
the source of the application or the text of a property has changed, say in the early morning hours of every
day. So far, however, we have not used this intriguing possibility.

Fig. 7 — A Sample Property Violation
A sample execution sequence, shown here in graphical form as a message sequence chart, presented
by the verification system as proof that executions are possible in which dialtone is not generated.
In this case the property violation can occur if the subscriber has call forwarding, and happens to
pick up the phone precisely when an incoming call is being forwarded. In an unlikely scenario, the
call processing software can be made to delay the generation of dialtone arbitrarily long while the
system is rejecting or forwarding more incoming calls. When the calls stop, the system will eventually
time out and deliver dialtone (not shown here). The scenario can also be presented as a trace of C
statement executions.

To initiate a check, the user selects one or more properties and provisioning constraints from the web
interface, and initiates the check with the click of a button. The remainder of this section describes what
happens when the button is pushed. The tasks to be performed in the verification process are divided over a
number of server applications that can run anywhere in the network. This capability to spread the work
over several machines allows us to exploit large numbers of independent processors to assist in the
execution of verification tasks. The additional processors are not necessary for FeaVer to perform its tasks, but
they can provide significant speedups. The network of processors that we have assembled for this purpose
is called TrailBlazer (Figure 9). Four basic types of servers together provide the required functionality:
tb_prep, tb_sched, tb_exec, and tb_wrap, as illustrated in Figure 8 and explained in more detail below.

1. The FeaVer web browser sends a request to initiate one or more verification runs to a server called
tb_prep. This server receives the property and provisioning information that the user provided and
starts the process.

It calls a program called pry to parse the C code of the application, identify the state routine, and
convert it into an intermediate format, organized as state, event, transition triples. Another program,
called catch, then parses this intermediate format and generates a SPIN verification model using the
conversion map. Catch adds the user-defined test drivers (defining context), suitably translated
provisioning information, and the property, after converting it into automata form. On average there are
two local states in the SPIN model for every line of source code in the application. The final model
defines the behavior of 7 different types of processes (several of which are used to create multiple
independent processing threads), 10 buffered message channels, and approximately 100 variables.
The model is constructed in less than a second.

Tb_prep also generates a script that can be used to generate C code for a dedicated verifier for the
model that was produced, and to compile and run that code. It now hands over the task to a
central task scheduling server, tb_sched, by sending it a reference to the job file.

Fig. 8 — Servers and Workflow in the TrailBlazer System
When the user presses the 'check' button on the web browser a sequence of steps is executed
to mechanically verify the properties selected. Negative results of the verification typically
flow back to the user within minutes after a check is initiated.

2. Tb_sched collects the information and adds it to its table of tasks to be completed. This server also
collects offers to execute jobs from arbitrary workstations in our network. To make such an offer the
workstation runs a small server program called tb_exec. The volunteering workstation can run any
type of operating system. The FeaVer servers, for instance, run under WindowsNT, and a number
of dedicated PCs that act as compute-servers run under the Plan9 operating system [P95]. Fifteen
PCs, shown in Figure 9, are permanently allocated to run verification jobs.

3. When the scheduler tb_sched identifies an available workstation it sends the corresponding server
tb_exec a job script, with information on where any dependent information (e.g., files to be
compiled) can be retrieved. The scheduler will attempt to have as many tasks performed in parallel as is
possible, without overloading any one of the workstations. Typically no new job is assigned to a
workstation until the previous one has completed. The search itself is performed with an iterative
search procedure that optimizes our chances of finding errors quickly (Side Box A).

4. When a workstation completes a task it signals its renewed availability to the scheduler tb_sched
and forwards the results of the run to the last server, on the FeaVer system, that performs
postprocessing: tb_wrap.

5. Tb_wrap produces the ASCII and graphical format for error sequences, and generates detailed C
traces. If no error was found, some statistics on the run are collected, to allow the user to judge the
validity of the result (see Avoiding False Positives). The statistics include the coverage of the
property automaton, and the coverage of the model code as a whole. The information is entered in the
database, and linked to the corresponding properties, so that it becomes accessible to the user via the
web browser interface.
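
The hand-off between the servers in steps 2 to 4 can be sketched as a small in-memory scheduler. This is only an illustration under stated assumptions: the class `Scheduler`, its method names, and the job and workstation names are hypothetical and do not reproduce the actual tb_sched protocol. The only behavior taken from the text is that workstations volunteer for work and that a workstation receives no new job until its previous one has completed.

```python
from collections import deque


class Scheduler:
    """Toy stand-in for tb_sched (hypothetical API, not the real server)."""

    def __init__(self):
        self.pending = deque()   # table of tasks still to be completed
        self.idle = deque()      # workstations that have offered to run a job
        self.running = {}        # workstation -> job currently assigned
        self.results = []        # (job, result) pairs handed on for postprocessing

    def submit(self, job):
        """Called on behalf of tb_prep: add a job to the task table."""
        self.pending.append(job)
        self._dispatch()

    def volunteer(self, workstation):
        """Called on behalf of tb_exec: offer a workstation for work."""
        if workstation not in self.running and workstation not in self.idle:
            self.idle.append(workstation)
        self._dispatch()

    def complete(self, workstation, result):
        """A workstation reports a finished job and becomes available again."""
        job = self.running.pop(workstation)
        self.results.append((job, result))
        self.volunteer(workstation)

    def _dispatch(self):
        # Assign at most one job to each idle workstation.
        while self.pending and self.idle:
            self.running[self.idle.popleft()] = self.pending.popleft()


sched = Scheduler()
for name in ("prop1", "prop2", "prop3"):
    sched.submit(name)
sched.volunteer("pc1")
sched.volunteer("pc2")
assigned_first = dict(sched.running)   # pc1 and pc2 are busy, prop3 waits
sched.complete("pc1", "no error")      # pc1 is now given prop3
```

In this sketch the third job is only assigned once the first workstation reports completion, which mirrors the one-job-per-workstation policy described in step 3.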







Fig. 9 — TrailBlazer Compute-Servers
Fifteen standard PCs, running the Plan9 operating system as
compute servers, give the TrailBlazer system a performance boost.

Tracking Progress 

When a comprehensive verification cycle is started for all properties that have been defined, for instance
after an update of the source text of the application, it is of great interest to know immediately when an
error sequence for a property has been discovered, so that it may be inspected. Typically this happens
within the first few minutes of a comprehensive run, but it is of course not known in advance which
properties might fail. The job scheduler tb_sched knows when the processing of an error sequence was
completed, and can prompt the user, pointing at a URL where detailed information on the error sequence can be
found, through the standard web browser. There is also a visual tracker, written in Tcl/Tk [O94], that
shows the progress of the search with color bars, one for each property being verified. The bar turns red as
soon as an error sequence has been discovered for the corresponding property. By clicking the bar, the
detailed information on the sequence can be brought up in a web browser. This application is illustrated in
Figure 10.

Separately, another small Tcl/Tk application can be used to track the actions of the scheduler: showing 
visually which machines have volunteered to execute jobs, which have been assigned a job and what the job 
details are, as illustrated in Figure 11. 

Avoiding False Positives

From the point of view of the verification system, the best possible outcome of a verification attempt is the
generation of an error sequence. There is a possibility, if the abstraction in the conversion table was chosen
incorrectly, that the sequence is invalid and constitutes what is called a false negative. By inspecting the
sequence this can usually be determined quickly, and the abstraction can be adjusted to prevent
reoccurrences. The absence of errors occurs when the application faithfully satisfies the property, but in
this case it is possible that the property itself was in fact inadequate. This is called a vacuous property,
cf. [KV99], or false positive, and it is addressed differently.

Fig. 10 — Property Verification Tracking
A list of all properties included in the current verification run is displayed. In the figure up to
eight cycles in the iterative search refinement process are executed. The search stops, marking the
progress line in red, as soon as an error is found.

Consider a property of the type we discussed earlier:

□ (p → X(◇ q))

It states that whenever a trigger condition p occurs then sometime thereafter, within a finite number of
execution steps, a response q will follow, where q itself can either be a proper response or a discharge
condition that voids the need for a response (e.g., an onhook event that voids the need for a dialtone
signal). If all is well, there will be executions in which p occurs at least once. If there are no executions
possible in which p occurs, then the formula is logically satisfied (note that the condition □ (p → q) is
satisfied when p is invariantly false). But even though the formula is strictly satisfied, it is almost
certainly not what the user intended. The telltale sign of this false positive can be found in the number of
states reached during the check for the automaton that corresponds to this property. In the case above, the
property automaton never leaves its initial state. This occurrence can easily, and mechanically, be detected,
so that the user can be warned to change the formulation of the property into a more meaningful one.
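
The mechanical check for a vacuous result can be illustrated with a small sketch. It assumes a simplified, deterministic two-state monitor in place of the real property automaton, and it works on explicit event lists rather than on the statistics reported by the verifier. The function names and the event names are hypothetical.

```python
def monitor_states_visited(executions, trigger, response):
    """Return the set of monitor states reached over all given executions.

    The monitor is a two-state simplification of the property automaton for
    'whenever trigger occurs, response eventually follows':
    'init' (nothing pending) and 'waiting' (trigger seen, response pending).
    """
    visited = {"init"}
    for events in executions:
        state = "init"
        for event in events:
            if state == "init" and event == trigger:
                state = "waiting"
            elif state == "waiting" and event == response:
                state = "init"
            visited.add(state)
    return visited


def is_vacuous(executions, trigger, response):
    """True if no execution ever moves the monitor out of its initial state."""
    return monitor_states_visited(executions, trigger, response) == {"init"}


runs = [["onhook", "ring"], ["ring", "onhook"]]      # the trigger never occurs
print(is_vacuous(runs, "offhook", "dialtone"))       # True: satisfied, but vacuously
print(is_vacuous(runs + [["offhook", "dialtone"]], "offhook", "dialtone"))  # False
```

A property for which `is_vacuous` returns true has never been exercised by any explored execution, which is the situation in which the user should be warned.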

Because properties that do not generate error sequences can take the longest run times (i.e., they will pass
through all iterative passes of the scheduler), the user can ask the scheduler to provide statistics on the runs
that have been completed for a property. If the first few approximate runs for a property all leave the
property automaton in its initial state, strong evidence that the property is void can be available within the
first few minutes of the verification, and it is not necessary for the user to wait for the complete
verification process to terminate.

Fig. 11 — System Status Tracking
Optionally the user of the FeaVer system can track the status of the workstations that have volunteered
to run verification jobs. A color code identifies which of these workstations are currently free (green),
which are busy (blue or yellow), and which are dead (red). To the left of busy workstations is briefly
indicated which job they are currently executing: a compilation (yellow) or a verification (blue).



Assessment

By concentrating on the central portion of the control code for call processing in PathStar, we were able to
perform unusually thorough checks of critical system properties. This narrow focus, however, also pushed
some interesting types of properties outside our reach. The main burden on the user of this checking
process is the definition of meaningful properties, not the mechanics of the checking process itself. The use
of temporal logic can be a stumbling block, even for experienced users. To try to remedy this we developed
the simple timeline editor, which is able to express the majority of the properties of interest in a fairly
intuitive way. The vacuity check on apparently positive results from a verification run has also proved to be
essential: it can be easy to state complex properties that are in retrospect meaningless. We built some
mechanized checks to warn the user of such occurrences.

In a production environment, with strict project deadlines, the likelihood that a system error is addressed
quickly is often inversely proportional to the amount of text that is needed to describe it. Execution
sequences of thousands of steps that putatively violate complex logic properties are not likely to get quick
attention. We therefore use the verification system in two phases: the first phase is used to identify all
possible property violations, and the second phase is used to generate the shortest possible example of each
violation discovered, selecting the most likely manifestation of the error. This strategy has proven
successful. In most cases an error can be demonstrated in no more than ten to fifteen steps, whereas an
initial error sequence might contain hundreds and risk escaping notice.

6. Conclusions

At the time of writing, we have tracked the design, evolution, and maintenance of the PathStar call
processing code over a period of approximately 18 months. In this period, the code grew fivefold in size and
went through roughly 300 different versions, often changing daily. We intercepted approximately 75 errors in
the implementation of the feature code by repeated verifications. Many of these errors were considered
critically important by the programmers, especially in the early phases of the design. About 5 of the errors
caught were also found independently by the normal system test group, especially in the later phases of the
project. (The traditional testing, of course, addressed the PathStar system as a whole, and did not
concentrate solely on the call processing code as we did.) In about 5 other cases the testers discovered an
error that should have been within the domain of our verifications. These missed errors were caused by
unstated or ambiguous system requirements; once the proper requirements were added into our database, the
violations were caught.

Flaws can enter source code in the initial design stages, but also, and perhaps more frequently, during
routine system maintenance. A portion of routine bug fixes will introduce new bugs into the code. The
ability of the FeaVer system to repeat comprehensive verification runs immediately after bug fixes are
made is therefore of great value. New incidences of property violations can be trapped instantly, while the
rationale for a code change is still fresh in the mind of the developer.

In several cases we used our verification system in an unexpected way as a diagnostic tool. Occasionally
the testers would run into a problem that could not be reproduced. By feeding the event sequence of such a
test into the FeaVer system the error sequence could be reproduced in these cases and studied to determine
which race conditions or event timings were responsible for its occurrence. In other cases the programmers
of the system wanted to confirm their intuition about the occurrence or absence of certain conditions, such
as a suspected unreachability of part of the code. The verification framework proved ideal to settle such
questions promptly.

The method of verification we have outlined in this paper should be generally applicable to distributed
systems code written in most programming languages. Our aim in the coming years is to apply the method to
a diverse set of applications, so that the checking process can be streamlined and made available for general
use.

Side Box A: Iterative Search Procedure

SPIN provides support for performing verification jobs either exactly or with varying degrees of precision,
using proof approximation techniques. The benefit of an exact run will be clear. An exact verification,
however, can be time consuming for larger problems. An approximate answer that can be delivered quickly
is often of more value to a user than a precise answer that takes much longer to compute. In an
approximate verification both the quality and the speed of the run can be controlled with a parameter. As the
thoroughness of the run increases or decreases, so do the time requirements. An approximate verification in
effect performs a random sampling of the behavior of a system, in an effort to find violations of system
requirements. With this technique we can find a practical compromise between verification and testing. The
coverage that is achieved in even approximate verification runs is significantly larger than what is
achievable with traditional testing techniques.

Our experience is that when property violations are possible, even very approximate verification runs can
identify them. The verification system uses this fact to enforce an iterative search refinement method for
each verification task. The iterative search procedure starts by allocating the fastest possible, and most
approximate, runs for each task. The first of these runs typically completes in under a second of CPU time.
If the run finds an error, the remainder of the runs can be abandoned, and the error sequence can be
processed for inclusion in the FeaVer database. If no error is found, the coverage is increased. The second run
may take two seconds, proceeding to four, eight, sixteen second runs, and so on until either an error is
found or maximal coverage was reached (and with that proof that no violations of the corresponding
property are possible).
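
The control loop of this refinement method can be sketched as follows. This is a minimal illustration, not the FeaVer implementation: `run_check` is a hypothetical caller-supplied function standing in for one verification run, and the three outcome labels are assumptions made for the example.

```python
def iterative_search(run_check, first_budget=1, max_cycles=12):
    """Repeat a check with a doubling budget until it is conclusive.

    run_check(budget) is a caller-supplied (hypothetical) function returning
    one of: ("error", trace), ("exact", None), or ("partial", None).
    Returns (outcome, trace, budgets_used).
    """
    budget = first_budget
    budgets_used = []
    for _ in range(max_cycles):
        budgets_used.append(budget)
        outcome, trace = run_check(budget)
        if outcome in ("error", "exact"):
            # Either a violation was found or maximal coverage was reached.
            return outcome, trace, budgets_used
        budget *= 2          # 1, 2, 4, 8, 16, ... seconds
    return "partial", None, budgets_used


def toy_check(budget):
    """Stand-in verifier: the violation only shows up with a budget >= 8."""
    if budget >= 8:
        return "error", ["offhook", "onhook", "offhook"]
    return "partial", None


outcome, trace, budgets = iterative_search(toy_check)
print(outcome, budgets)      # error [1, 2, 4, 8]
```

The loop stops at the first conclusive run, so a property that fails early never consumes the longer, more thorough runs.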

With, say, 200 verification runs to be performed and 20 workstations available to perform the runs, the first
approximate runs can be completed in about ten seconds for all jobs combined. In each new iteration all
jobs that produced errors are deleted from the workset, and the scan becomes more thorough for the
remaining properties. With this procedure it typically takes a few minutes to identify the first property
violation in a large set of verification tasks. After about five minutes a representative selection of
violations is normally available, with the gaps filled in by subsequent searches. The iterative search
procedure is abandoned after about an hour, whether it has reached fully exact results or not. The rationale
is that within an hour one normally will have looked at the property violations and formulated corrections of
the source code. Further error sequences would be of little use, since the source code has by now changed,
and more value can be derived from a new scan of all properties for the new version of the source.
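
The ten-second figure follows from simple arithmetic, which the sketch below makes explicit. It assumes, purely for illustration, that every run in an iteration takes the same time (one second in the first iteration, doubling thereafter), that each workstation handles one job at a time, and that no jobs drop out of the workset. The function names are hypothetical.

```python
import math


def round_time(jobs, workstations, seconds_per_run):
    """Wall-clock seconds for one iteration, assuming equal-length runs and
    at most one job per workstation at a time."""
    return math.ceil(jobs / workstations) * seconds_per_run


def cycles_within(limit_seconds, jobs, workstations, first_run=1):
    """Number of doubling cycles that fit in the limit if no job drops out."""
    elapsed, cycles, run = 0, 0, first_run
    while elapsed + round_time(jobs, workstations, run) <= limit_seconds:
        elapsed += round_time(jobs, workstations, run)
        cycles += 1
        run *= 2
    return cycles, elapsed


print(round_time(200, 20, 1))          # 10 seconds for the first iteration
print(cycles_within(3600, 200, 20))    # (8, 2550): eight cycles fit in an hour
```

Under these simplifying assumptions eight doubling cycles fit within the one-hour limit, which is in line with the eight cycles mentioned in the caption of Figure 10, although the actual run lengths used by the system are not stated in the text. In practice jobs that produce errors leave the workset, so later iterations finish sooner than this worst case.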

Once all verification tasks have been completed, an optional second phase of the verification is performed 
by reinspecting every error sequence found and, again iteratively, searching for a shorter equivalent (see 
also Assessment). 
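
One way to picture this second phase is a search that is repeated with an ever tighter depth bound until no shorter error sequence can be found. The sketch below does this on a small explicit state graph. It is an assumption-laden illustration, not the procedure used by SPIN or FeaVer: the graph, the state names, and both function names are hypothetical, and the exhaustive path enumeration is only practical for toy examples.

```python
def bounded_dfs(graph, start, is_error, bound):
    """Depth-first search for an error state using at most `bound` steps."""
    if bound < 0:
        return None
    stack = [(start, [start])]
    while stack:
        state, path = stack.pop()
        if is_error(state):
            return path
        if len(path) - 1 < bound:
            for nxt in graph.get(state, []):
                if nxt not in path:          # avoid cycling on this path
                    stack.append((nxt, path + [nxt]))
    return None


def shorten(graph, start, is_error, first_bound):
    """Iteratively tighten the depth bound until no shorter trace exists."""
    best = bounded_dfs(graph, start, is_error, first_bound)
    while best is not None:
        # A strictly shorter trace has at most (steps in best) - 1 steps.
        shorter = bounded_dfs(graph, start, is_error, len(best) - 2)
        if shorter is None:
            break
        best = shorter
    return best


graph = {
    "idle": ["b1", "a1"],
    "a1": ["a2"], "a2": ["a3"], "a3": ["error"],
    "b1": ["error"],
}
first = bounded_dfs(graph, "idle", lambda s: s == "error", 10)
shortest = shorten(graph, "idle", lambda s: s == "error", 10)
print(len(first) - 1, len(shortest) - 1)     # 4 2
```

Here the first search happens to return a four-step error sequence, and the repeated, tighter searches reduce it to the two-step sequence that reaches the same error.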

Side Box B: Omega Automata

The formal definition of an ω-automaton, as shown in Figure 4, differs slightly from that of a standard
finite automaton. Instead of accepting (input) sequences of finite length, like a standard finite automaton,
an ω-automaton accepts only sequences (in our case representing system executions) of infinite length.
There are several ways to define the acceptance conditions for an ω-automaton [T90]. The definition used
in SPIN is known as Büchi acceptance. It states that a sequence is accepted if and only if it visits at least
one accepting state in the automaton (indicated with a double circle in Figure 4) infinitely often.

The automaton in Figure 4 is also non-deterministic, which makes its behavior less obvious. In the model
checking process, the transitions of this automaton are 'matched' one by one against the execution steps of
the system. Execution starts with the property automaton in its initial state s0. After each step of the system
the property automaton is forced to make a transition. To do so it can choose only from transitions with a
label that evaluates to true at this point in the execution. The self-loop on s0 in Figure 4 can always be
traversed, since its label necessarily evaluates to true. The transition from s0 to s1 can only be taken when
an offhook is detected. Note carefully that if an offhook is detected the property automaton can either stay in
s0, and ignore this event, or move to s1 and start the wait for a dialtone (i.e., it makes a non-deterministic
choice). The verifier will check the consequences of either choice. The latter is important because we want
to make sure that every occurrence of an offhook is followed by a dialtone, not just the first occurrence.
Once the property automaton reaches state s1 it can only remain there in the absence of dialtones and
onhooks. A sequence is formally accepted by the automaton if and only if it is possible for the property
automaton to remain in s1 forever, infinitely often traversing the self-loop on that state. If a dialtone is
detected within a finite number of steps, the attempt to match the corresponding execution to the property
automaton fails, which means that the execution satisfies the original requirement and does not constitute a
violation. The matching process stops at this point. The model checker will abandon the search of this
execution and explore other possible executions instead, in a hunt for possible violations. There need not be
a back-edge from state s1 to state s0, since all behaviors of interest are already captured by the
non-determinism on s0 (see above).
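
The matching process can be made concrete for executions that end in a repeating cycle, since such an execution has a finite description. The sketch below encodes the two-state automaton as described in this side box and decides whether some run can stay in s1 forever. It is an illustration only: the representation of an execution as a list of event names, the event name "tick", and the function names are assumptions, and a model checker performs this search on the fly over the whole system rather than on one given execution.

```python
def enabled(state, event):
    """Successor states of the two-state property automaton for one event."""
    if state == "s0":
        # The self-loop is always enabled; an offhook may also move to s1.
        return {"s0", "s1"} if event == "offhook" else {"s0"}
    # In s1 the automaton can only stay while no dialtone or onhook occurs.
    return {"s1"} if event not in ("dialtone", "onhook") else set()


def violates(prefix, cycle):
    """True if the execution 'prefix, then cycle repeated forever' is
    accepted, i.e. some run of the automaton visits s1 infinitely often."""
    if not cycle:
        raise ValueError("the repeating part of the execution must be non-empty")
    events = prefix + cycle
    n, loop_start = len(events), len(prefix)

    def successors(node):
        pos, state = node
        nxt_pos = pos + 1 if pos + 1 < n else loop_start
        return {(nxt_pos, s) for s in enabled(state, events[pos])}

    def reachable(start):
        """All nodes reachable from start in one or more steps."""
        seen, todo = set(), [start]
        while todo:
            for nxt in successors(todo.pop()):
                if nxt not in seen:
                    seen.add(nxt)
                    todo.append(nxt)
        return seen

    # Accepted iff an s1 node is reachable from the start and lies on a cycle.
    candidates = [node for node in reachable((0, "s0")) if node[1] == "s1"]
    return any(node in reachable(node) for node in candidates)


print(violates(["offhook"], ["tick"]))                          # True
print(violates(["offhook", "dialtone"], ["tick"]))              # False
print(violates(["offhook", "dialtone"], ["offhook", "tick"]))   # True
```

The third call shows why the non-deterministic choice in s0 matters: the first offhook is answered by a dialtone, but a later one is not, and the run that ignores the first offhook and moves to s1 on a later one exposes the violation.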



